Filtering structured documents in the SYNDOC environment

Authors

  • E. KUIKKA
  • A. SALMINEN
Abstract

This paper describes a filtering approach for searching documents whose structure is defined by a grammar. The method is based on a theoretical model for defining filters that specify the information interests of a user. It is used to find documents in SYNDOC, a syntax-directed text processing system, and is suitable, for example, for SGML and ODA documents. The user selects a grammar and indexes only documents for the selected grammar. A filter, generated in a syntax-directed way from the grammar, describes conditions on the indexed documents and integrates structure and content constraints. The user compares a filter with the indexed documents and then edits, browses or prints the original documents using the selected output form. Indexed documents, filters and retrieved documents can be stored for later use.
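
As a rough illustration of a filter that combines a structure constraint with a content constraint, consider the following minimal Python sketch. It is not SYNDOC's implementation: the names Element, Filter, select, full_text and matches are hypothetical, the document is assumed to be a simple tree of named elements, the structure constraint is assumed to be a path of element names from the root, and the content constraint is assumed to be a case-insensitive substring test.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of a structured document (hypothetical model, not SYNDOC's)."""
    name: str
    text: str = ""
    children: list[Element] = field(default_factory=list)


@dataclass
class Filter:
    """Assumed filter shape: one structure and one content constraint."""
    path: tuple[str, ...]  # structure: element names, starting at the root
    term: str              # content: text that must occur in a selected element


def select(root: Element, path: tuple[str, ...]):
    """Yield every element reached by following `path` from `root`."""
    if not path or root.name != path[0]:
        return
    if len(path) == 1:
        yield root
        return
    for child in root.children:
        yield from select(child, path[1:])


def full_text(element: Element) -> str:
    """Concatenate the text of an element and all of its descendants."""
    return " ".join([element.text] + [full_text(c) for c in element.children])


def matches(document: Element, flt: Filter) -> bool:
    """True if some element on the filter's path contains the filter's term."""
    term = flt.term.lower()
    return any(term in full_text(e).lower() for e in select(document, flt.path))


# Two small "indexed documents" sharing the same (assumed) structure.
doc1 = Element("article", children=[
    Element("title", "Filtering structured text"),
    Element("body", "Documents defined by a grammar."),
])
doc2 = Element("article", children=[
    Element("title", "Overlay networks"),
    Element("body", "Filtering at scale."),
])

# Only documents whose title mentions "filtering" are retrieved.
title_filter = Filter(path=("article", "title"), term="filtering")
retrieved = [d for d in (doc1, doc2) if matches(d, title_filter)]
```

Here doc2 is not retrieved even though its body mentions filtering, because the structure constraint restricts the content test to the title element.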

Similar resources

Implementation of Two-dimensional Filters for Structured Documents

Filtering is used to select a subset, corresponding to the information interests of a user, from a set of information items. The information interests are described in a filter which is created to control the selection. In our earlier work we have described a theoretical framework for specifying filters to express content-based and structure-oriented constraints on structured text. In the filters, th...

pFilter: Global Information Filtering and Dissemination Using Structured Overlay Networks

The exponential data growth rate of the Internet makes it increasingly difficult for people to find desired information in a timely fashion. Information filtering and dissemination systems allow users to register persistent queries called user profiles, and notify users when relevant files become available. Existing such systems, however, either are not scalable, or do not support matching of u...

Information extraction for semi-structured documents

The number of unstructured or semi-structured documents produced in all types of organizations continues to increase rapidly. Cost-effective ways of finding the relevant ones and extracting useful information from them are increasingly important to a large number of enterprises for operational and decision-support applications. The approach discussed in this paper constitutes a suitable basis f...

Learning from Labeled Features for Document Filtering

Existing document filtering systems learn user profiles based on user relevance feedback on documents. In some cases, users may have prior knowledge about what features are important. For example, a Spanish speaker may only want news written in Spanish, and thus a relevant document should contain the feature “Language: Spanish”; a researcher focusing on HIV knows an article with the medical subj...

Developing a qualitative model for managing teaching-learning processes in elementary school

The aim of the study was to develop a qualitative model of teaching-learning processes management (QMTLPM) for elementary school classrooms. Criteria and indicators for QMTLPM were identified through in-depth study of available sources and interviews with focus groups. In terms of design the study is qualitative, and in terms of strategy it is a qualitative case study. Potential study...

Publication date: 1995